Semi-Supervised Multimodal Deep Learning for RGB-D Object Recognition

Authors

Yanhua Cheng

Xin Zhao

Rui Cai

Zhiwei Li

Kaiqi Huang

Yong Rui

Abstract

This paper studies the problem of RGB-D object recognition. Inspired by the great success of deep convolutional neural networks (DCNNs) in AI, researchers have tried to apply them to improve the performance of RGB-D object recognition. However, a DCNN typically requires a large-scale annotated dataset to supervise its training. Manually labeling such a large RGB-D dataset is expensive and time-consuming, which has slowed the adoption of DCNNs in this research area. To address this problem, we propose a semi-supervised multimodal deep learning framework that trains a DCNN effectively from very limited labeled data and massive unlabeled data. The core of our framework is a novel diversity-preserving co-training algorithm, which guides the DCNN to learn from unlabeled RGB-D data by making full use of the complementary cues of the RGB and depth data in object representation. Experiments on the benchmark RGB-D dataset demonstrate that, with only 5% labeled training data, our approach achieves performance competitive with state-of-the-art results reported by fully supervised methods.
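
The abstract does not spell out the co-training procedure, so the following is only a minimal sketch of generic two-view co-training, not the authors' diversity-preserving algorithm. It assumes each sample comes as a pair of feature vectors (one RGB, one depth), swaps the DCNNs for a toy nearest-centroid classifier, and assumes every class has at least one labeled example. The names `fit_centroids`, `predict_proba`, `co_train`, `rounds`, and `per_class` are hypothetical. Picking the most confident samples per class is a crude stand-in for keeping pseudo-labels spread across classes; it is not claimed to match the paper's mechanism.

```python
import numpy as np

def fit_centroids(X, y, n_classes):
    """Toy classifier: one mean feature vector per class (assumes no empty class)."""
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])

def predict_proba(centroids, X):
    """Softmax over negative squared distances to each class centroid."""
    dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -dist
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def co_train(rgb_l, depth_l, y_l, rgb_u, depth_u, n_classes, rounds=5, per_class=2):
    """Generic co-training: each view pseudo-labels samples for a shared pool."""
    rgb_l, depth_l, y_l = rgb_l.copy(), depth_l.copy(), y_l.copy()
    remaining = np.arange(len(rgb_u))  # indices of still-unlabeled samples
    for _ in range(rounds):
        if remaining.size == 0:
            break
        c_rgb = fit_centroids(rgb_l, y_l, n_classes)
        c_depth = fit_centroids(depth_l, y_l, n_classes)
        picked = {}  # unlabeled index -> pseudo-label
        for centroids, view in ((c_rgb, rgb_u), (c_depth, depth_u)):
            proba = predict_proba(centroids, view[remaining])
            labels = proba.argmax(axis=1)
            conf = proba.max(axis=1)
            for c in range(n_classes):
                members = np.where(labels == c)[0]
                top = members[np.argsort(-conf[members])[:per_class]]
                for i in top:
                    # On a conflict, the view processed first (RGB) wins.
                    picked.setdefault(int(remaining[i]), c)
        idx = np.array(list(picked.keys()), dtype=int)
        pseudo = np.array(list(picked.values()), dtype=int)
        # Move the pseudo-labeled samples into the labeled pool of both views.
        rgb_l = np.vstack([rgb_l, rgb_u[idx]])
        depth_l = np.vstack([depth_l, depth_u[idx]])
        y_l = np.concatenate([y_l, pseudo])
        remaining = np.setdiff1d(remaining, idx)
    return (fit_centroids(rgb_l, y_l, n_classes),
            fit_centroids(depth_l, y_l, n_classes),
            y_l)
```

In this sketch, a sample that one view classifies confidently becomes a training example for the other view as well, which is the sense in which the two modalities supply complementary cues.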

Similar resources

Learning Deep Representations, Embeddings and Codes from the Pixel Level of Natural and Medical Images

Significant research has gone into engineering representations that can identify high-level semantic structure in images, such as objects, people, events and scenes. Recently there has been a shift towards learning representations of images either on top of dense features or directly from the pixel level. These features are often learned in hierarchies using large amounts of unlabeled data with...

Low-Shot Learning for the Semantic Segmentation of Remote Sensing Imagery

Recent advances in computer vision using deep learning with RGB imagery (e.g., object recognition and detection) have been made possible thanks to the development of large annotated RGB image datasets. In contrast, multispectral image (MSI) and hyperspectral image (HSI) datasets contain far fewer labeled images, in part due to the wide variety of sensors used. These annotations are especially l...

Semi-supervised Multimodal Learning with Deep Generative Models

In recent years, deep neural networks have been used mainly as discriminators in multimodal learning. Training them requires large amounts of labeled data, but obtaining such data is difficult because labeling inputs is labor-intensive. Therefore, semi-supervised learning, which improves discriminator performance using unlabeled data, is important. Among semi-supervised learning, me...

Improved RGB-D-T based face recognition

Reliable facial recognition systems are of crucial importance in various applications, from entertainment to security. Thanks to the deep-learning concepts introduced in the field, a significant improvement in the performance of unimodal facial recognition systems has been observed in recent years. At the same time, multimodal facial recognition is a promising approach. This paper combi...

Correlated and Individual Multi-Modal Deep Learning for RGB-D Object Recognition

In this paper, we propose a correlated and individual multi-modal deep learning (CIMDL) method for RGB-D object recognition. Unlike most conventional RGB-D object recognition methods, which extract features from the RGB and depth channels individually, our CIMDL jointly learns feature representations from raw RGB-D data with a pair of deep neural networks, so that the sharable and modal-specific ...

Publication date: 2016
